nct (Noncentral t) distribution#
The noncentral t distribution is what you get when a t-statistic is evaluated under an alternative hypothesis (i.e., when the true mean is not equal to the null mean). It generalizes Student’s t by introducing a noncentrality parameter that shifts the underlying normal component and typically induces skewness.
A clean way to remember it:
central t: \(T = Z/\sqrt{V/\nu}\) with \(Z\sim\mathcal N(0,1)\)
noncentral t: \(T = (Z+\delta)/\sqrt{V/\nu}\) with \(Z\sim\mathcal N(0,1)\)
where \(V\sim\chi^2_{\nu}\) is independent of \(Z\).
Learning goals#
Classify and parameterize nct: support, parameters, and limiting cases.
Use the stochastic construction to derive mean/variance and understand moment existence.
Implement NumPy-only sampling from first principles.
Visualize PDF/CDF and validate Monte Carlo simulations.
Use scipy.stats.nct for pdf, cdf, rvs, and fit.
import platform
import numpy as np
import plotly.graph_objects as go
import os
import plotly.io as pio
from plotly.subplots import make_subplots
import scipy
from scipy import optimize, special
from scipy.stats import chi2, nct, norm, t
pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")
np.set_printoptions(precision=4, suppress=True)
rng = np.random.default_rng(7)
print("Python", platform.python_version())
print("NumPy", np.__version__)
print("SciPy", scipy.__version__)
Python 3.12.9
NumPy 1.26.2
SciPy 1.15.0
1) Title & classification#
Name: nct (noncentral t distribution)
Type: continuous distribution
Support: \(x \in (-\infty, \infty)\)
Parameter space:
degrees of freedom \(\nu > 0\)
noncentrality \(\delta \in \mathbb R\)
We write \(T \sim \mathrm{NCT}(\nu, \delta)\).
In SciPy, the standardized form is scipy.stats.nct(df=nu, nc=delta) (with optional loc and scale).
2) Intuition & motivation#
2.1 What it models#
The noncentral t distribution models a signal-to-noise ratio with estimated noise. It is the sampling distribution of the usual t-statistic when the true mean is not the null mean.
A one-sample setup makes this concrete:
data: \(X_1,\dots,X_n \stackrel{\text{iid}}{\sim} \mathcal N(\mu,\sigma^2)\)
test statistic: \(T = \dfrac{\bar X - \mu_0}{S/\sqrt{n}}\)
If \(\mu \neq \mu_0\), then \(T \sim \mathrm{NCT}(\nu, \delta)\) with \(\nu = n-1\) and \(\delta = \sqrt{n}\,(\mu - \mu_0)/\sigma\).
2.2 Typical real-world use cases#
Power analysis for t-tests (probability of rejecting the null under a specific effect size).
A/B testing with approximately normal outcomes (means) and unknown variance.
Design of experiments: translate a practically meaningful effect into power curves.
Effect size modeling: sampling distribution of standardized mean differences.
2.3 Relations to other distributions#
Student’s t: \(\delta=0\) gives the central t distribution with \(\nu\) degrees of freedom.
Normal limit: as \(\nu\to\infty\), \(V/\nu\to 1\) and \(T\Rightarrow \mathcal N(\delta,1)\).
Noncentral F: if \(T\sim\mathrm{NCT}(\nu,\delta)\), then \(T^2 \sim \mathrm{NCF}(1,\nu,\lambda=\delta^2)\).
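These three relations can be checked numerically. A quick sketch with SciPy (the parameter values and tolerances below are arbitrary choices for illustration):

```python
import numpy as np
from scipy.stats import ncf, nct, norm, t

x = np.linspace(-4.0, 4.0, 9)
df, nc = 7.0, 1.5

# delta = 0 recovers the central t distribution.
assert np.allclose(nct.cdf(x, df, 0.0), t.cdf(x, df), atol=1e-10)

# Large-nu limit: NCT(nu, delta) approaches Normal(delta, 1).
assert np.allclose(nct.cdf(x, 1e4, nc), norm.cdf(x, loc=nc), atol=1e-3)

# T^2 ~ noncentral F(1, nu, lambda = delta^2):
# P(T^2 <= y) = F_T(sqrt(y)) - F_T(-sqrt(y)).
y = np.linspace(0.1, 10.0, 9)
lhs = nct.cdf(np.sqrt(y), df, nc) - nct.cdf(-np.sqrt(y), df, nc)
assert np.allclose(lhs, ncf.cdf(y, 1, df, nc**2), atol=1e-6)
print("all three relations verified numerically")
```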
3) Formal definition#
Let \(\phi\) and \(\Phi\) denote the standard normal PDF and CDF.
Let \(Z\sim\mathcal N(0,1)\) and \(V\sim\chi^2_{\nu}\) be independent, with \(\nu>0\). Define
\[ T = \frac{Z+\delta}{\sqrt{V/\nu}}. \]
Then \(T\sim\mathrm{NCT}(\nu,\delta)\).
3.1 Conditional (mixture) view#
Conditioning on \(V=v\), we have \(\sqrt{V/\nu}=\sqrt{v/\nu}=s\) (a constant), so
\[ T \mid V=v \;\sim\; \mathcal N\!\left(\frac{\delta}{s},\; \frac{1}{s^2}\right). \]
3.2 PDF (integral representation)#
Write the chi-square density
\[ f_V(v) = \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\, v^{\nu/2-1} e^{-v/2}, \qquad v>0. \]
Using the conditional normal density and integrating out \(V\) gives an explicit representation:
\[ f_T(x) = \int_0^\infty \sqrt{v/\nu}\;\phi\!\left(x\sqrt{v/\nu} - \delta\right) f_V(v)\, dv. \]
Closed forms exist in terms of special functions / infinite series; numerical libraries (SciPy) evaluate the PDF using specialized routines.
3.3 CDF (integral representation)#
Similarly, the conditional CDF is normal, so
\[ F_T(x) = \int_0^\infty \Phi\!\left(x\sqrt{v/\nu} - \delta\right) f_V(v)\, dv. \]
SciPy exposes a dedicated special function for the CDF: scipy.special.nctdtr(df, nc, x).
3.4 Quantiles#
There is no simple closed form for the PPF \(F^{-1}(p)\); it is computed numerically (e.g. nct.ppf).
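The same quantile can be recovered from first principles by inverting the CDF with a generic root finder; a minimal sketch (the bracket \([-50, 50]\) is an assumption that safely contains the quantile for these parameters):

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import nct

df, nc, p = 8.0, 1.5, 0.9

q = nct.ppf(p, df, nc)  # SciPy's numerical quantile

# Invert the CDF directly: find x with F(x) - p = 0.
q_manual = brentq(lambda v: nct.cdf(v, df, nc) - p, -50.0, 50.0)

assert np.isclose(q, q_manual, atol=1e-8)
assert np.isclose(nct.cdf(q, df, nc), p, atol=1e-9)
```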
def nct_pdf(x: np.ndarray, df: float, nc: float) -> np.ndarray:
x = np.asarray(x, dtype=float)
return nct.pdf(x, df, nc)
def nct_logpdf(x: np.ndarray, df: float, nc: float) -> np.ndarray:
x = np.asarray(x, dtype=float)
return nct.logpdf(x, df, nc)
def nct_cdf(x: np.ndarray, df: float, nc: float) -> np.ndarray:
x = np.asarray(x, dtype=float)
# Same as nct.cdf, but explicitly shows the specialized implementation SciPy uses.
return special.nctdtr(df, nc, x)
4) Moments & properties#
A key structural fact from the definition is that
\[ T = Y \cdot S, \qquad Y = Z + \delta, \quad S = \sqrt{\nu/V}, \]
where \(Y\sim\mathcal N(\delta,1)\) and \(V\sim\chi^2_{\nu}\) are independent. This lets you compute moments by separating them:
\[ \mathbb E[T^k] = \mathbb E[Y^k]\,\mathbb E[S^k]. \]
4.1 Existence of moments#
Because \(S^k=(\nu/V)^{k/2}\), the moment \(\mathbb E[S^k]\) exists iff \(\nu>k\). Therefore:
mean exists for \(\nu>1\)
variance exists for \(\nu>2\)
skewness exists for \(\nu>3\)
kurtosis exists for \(\nu>4\)
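The factor behind these thresholds is \(\mathbb E[S^k]=(\nu/2)^{k/2}\,\Gamma((\nu-k)/2)/\Gamma(\nu/2)\) for \(\nu>k\) (the same formula used by the helper in the code below). A quick sketch of how it behaves near the threshold (the helper name `e_s_pow_k` is ours, not SciPy's):

```python
import numpy as np
from scipy.special import gamma

def e_s_pow_k(df: float, k: int) -> float:
    """E[S^k] = E[(df/V)^(k/2)] for V ~ chi2(df); finite only when df > k."""
    if df <= k:
        return np.inf
    return (df / 2) ** (k / 2) * gamma((df - k) / 2) / gamma(df / 2)

# For k = 2 the general formula reduces to nu/(nu - 2):
assert np.isclose(e_s_pow_k(5.0, 2), 5.0 / 3.0)

# The factor blows up as nu decreases toward k ...
print([round(e_s_pow_k(d, 2), 2) for d in (4.0, 2.5, 2.1)])  # [2.0, 5.0, 21.0]

# ... and the moment no longer exists at or below the threshold.
assert e_s_pow_k(2.0, 2) == np.inf
```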
4.2 Mean and variance#
Using independence, for \(\nu>1\):
\[ \mathbb E[T] = \mathbb E[Y]\,\mathbb E[S] = \delta\,\sqrt{\frac{\nu}{2}}\;\frac{\Gamma((\nu-1)/2)}{\Gamma(\nu/2)}. \]
Also, for \(\nu>2\),
\[ \mathbb E[T^2] = \mathbb E[Y^2]\,\mathbb E[S^2] = (1+\delta^2)\,\frac{\nu}{\nu-2}. \]
So
\[ \mathrm{Var}(T) = (1+\delta^2)\,\frac{\nu}{\nu-2} - \big(\mathbb E[T]\big)^2. \]
4.3 Skewness and kurtosis#
Raw moments of \(Y\sim\mathcal N(\delta,1)\) up to order 4 are
\(\mathbb E[Y]=\delta\)
\(\mathbb E[Y^2]=\delta^2+1\)
\(\mathbb E[Y^3]=\delta^3+3\delta\)
\(\mathbb E[Y^4]=\delta^4+6\delta^2+3\)
Combine them with
\[ \mathbb E[S^k] = \left(\frac{\nu}{2}\right)^{k/2} \frac{\Gamma((\nu-k)/2)}{\Gamma(\nu/2)}, \qquad \nu > k, \]
to get \(\mathbb E[T^k] = \mathbb E[Y^k]\,\mathbb E[S^k]\), then convert to central moments.
4.4 MGF and characteristic function#
MGF: like Student's t, nct has polynomial tails, so the MGF does not exist (is infinite) for any nonzero argument.
Characteristic function: it exists for all real \(\omega\) and can be written via the conditional normal form:
\[ \varphi_T(\omega) = \mathbb E\!\left[\exp\!\left(i\omega\delta S - \tfrac{1}{2}\omega^2 S^2\right)\right], \qquad S = \sqrt{\nu/V}. \]
This expectation has closed forms in terms of special functions, but is typically evaluated numerically.
4.5 Entropy#
There is no simple closed form for the differential entropy; in practice it is computed numerically (e.g. nct.entropy).
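One way to compute it numerically is direct quadrature of \(-\int f\log f\); a sketch, reusing the parameters from the SciPy example later in this notebook:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import nct

df, nc = 12.0, -0.75  # same parameters as the SciPy integration section below

def integrand(x: float) -> float:
    """-f(x) log f(x), guarded so underflowed densities contribute 0."""
    fx = nct.pdf(x, df, nc)
    return 0.0 if fx == 0.0 else -fx * np.log(fx)

# Split at 0 so the adaptive quadrature resolves the central peak.
h_num = quad(integrand, -np.inf, 0.0)[0] + quad(integrand, 0.0, np.inf)[0]

assert np.isclose(h_num, nct.entropy(df, nc), atol=1e-5)
print(round(h_num, 4))  # ≈ 1.5144
```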
def _e_scaled_inv_chi_power(df: float, k: int) -> float:
"""E[(df/V)^(k/2)] for V ~ chi2(df). Exists iff df > k."""
if df <= k:
return np.nan
return (df / 2) ** (k / 2) * special.gamma((df - k) / 2) / special.gamma(df / 2)
def nct_moments_closed_form(df: float, nc: float):
"""Mean/variance/skewness/excess kurtosis from the stochastic representation.
Returns (mean, var, skew, exkurt). Values are NaN when the moment does not exist.
"""
# Moments of Y ~ Normal(nc, 1)
y1 = nc
y2 = nc**2 + 1.0
y3 = nc**3 + 3.0 * nc
y4 = nc**4 + 6.0 * nc**2 + 3.0
s1 = _e_scaled_inv_chi_power(df, 1)
s2 = _e_scaled_inv_chi_power(df, 2)
s3 = _e_scaled_inv_chi_power(df, 3)
s4 = _e_scaled_inv_chi_power(df, 4)
m1 = y1 * s1 if df > 1 else np.nan
m2 = y2 * s2 if df > 2 else np.nan
m3 = y3 * s3 if df > 3 else np.nan
m4 = y4 * s4 if df > 4 else np.nan
if not np.isfinite(m1) or not np.isfinite(m2):
return m1, np.nan, np.nan, np.nan
var = m2 - m1**2
if not np.isfinite(var) or var <= 0:
return m1, var, np.nan, np.nan
if not np.isfinite(m3):
return m1, var, np.nan, np.nan
mu3 = m3 - 3 * m1 * m2 + 2 * m1**3
skew = mu3 / (var ** 1.5)
if not np.isfinite(m4):
return m1, var, skew, np.nan
mu4 = m4 - 4 * m1 * m3 + 6 * (m1**2) * m2 - 3 * m1**4
exkurt = mu4 / (var**2) - 3
return m1, var, skew, exkurt
# Quick numeric cross-check vs SciPy (avoid nc==0 exactly due to a corner-case in some SciPy versions)
for df, nc in [(10.0, 1.2), (6.0, 0.1), (3.5, -1.0)]:
m, v, s, k = nct_moments_closed_form(df, nc)
m_s, v_s, s_s, k_s = nct.stats(df, nc, moments="mvsk")
print(f"df={df:>4}, nc={nc:>5} | closed-form mvsk = {m: .6f}, {v: .6f}, {s: .6f}, {k: .6f}")
print(f" scipy mvsk = {m_s: .6f}, {v_s: .6f}, {s_s: .6f}, {k_s: .6f}")
df=10.0, nc= 1.2 | closed-form mvsk = 1.300467, 1.358786, 0.472341, 1.349288
scipy mvsk = 1.300467, 1.358786, 0.472341, 1.349288
df= 6.0, nc= 0.1 | closed-form mvsk = 0.115124, 1.501746, 0.093929, 3.017647
scipy mvsk = 0.115124, 1.501746, 0.093929, 3.017647
df= 3.5, nc= -1.0 | closed-form mvsk = -1.304653, 2.964547, -4.448490, nan
scipy mvsk = -1.304653, 2.964547, -4.448490, nan
5) Parameter interpretation#
5.1 Degrees of freedom \(\nu\)#
Controls tail heaviness (smaller \(\nu\) ⇒ heavier tails, more extreme values).
Determines which moments exist: the \(k\)-th moment requires \(\nu>k\).
As \(\nu\to\infty\), the randomness in \(\sqrt{V/\nu}\) vanishes and the distribution approaches \(\mathcal N(\delta,1)\).
5.2 Noncentrality \(\delta\)#
Shifts the underlying normal component \(Z+\delta\).
Typically induces skewness (unless \(\delta=0\), where the distribution is symmetric).
In hypothesis testing, \(\delta\) is a standardized effect size multiplied by \(\sqrt{n}\).
We’ll visualize these effects next.
x = np.linspace(-8, 8, 1200)
fig = make_subplots(
rows=1,
cols=2,
subplot_titles=("Vary df (nc fixed)", "Vary nc (df fixed)"),
)
# Left: varying df
nc_fixed = 1.5
for df in [2.5, 5, 15, 80]:
fig.add_trace(
go.Scatter(x=x, y=nct_pdf(x, df, nc_fixed), name=f"df={df}, nc={nc_fixed}", mode="lines"),
row=1,
col=1,
)
# Right: varying nc
df_fixed = 10
for nc in [-3, -1.5, 0.0, 1.5, 3.0]:
fig.add_trace(
go.Scatter(x=x, y=nct_pdf(x, df_fixed, nc), name=f"df={df_fixed}, nc={nc}", mode="lines"),
row=1,
col=2,
)
fig.update_layout(height=420, width=980, title_text="Noncentral t: PDF shape changes")
fig.update_xaxes(title_text="x")
fig.update_yaxes(title_text="density")
fig.show()
6) Derivations#
6.1 Expectation#
Start from the stochastic construction
\[ T = (Z+\delta)\,\sqrt{\nu/V}. \]
Then
\[ \mathbb E[T] = \delta\,\sqrt{\nu}\;\mathbb E\!\left[V^{-1/2}\right], \]
since \(\mathbb E[Z]=0\). Because \(V\sim\chi^2_{\nu}=\mathrm{Gamma}(k=\nu/2,\theta=2)\), for \(\nu>1\)
\[ \mathbb E\!\left[V^{-1/2}\right] = \frac{\Gamma((\nu-1)/2)}{\sqrt{2}\,\Gamma(\nu/2)}. \]
Multiplying by \(\delta\sqrt{\nu}\) gives
\[ \mathbb E[T] = \delta\,\sqrt{\frac{\nu}{2}}\;\frac{\Gamma((\nu-1)/2)}{\Gamma(\nu/2)}. \]
6.2 Variance#
Compute the second raw moment (exists for \(\nu>2\)):
\[ \mathbb E[T^2] = \nu\,\mathbb E\!\left[(Z+\delta)^2\right]\mathbb E\!\left[\frac{1}{V}\right] = \frac{\nu\,(1+\delta^2)}{\nu-2}, \]
since \(\mathbb E[(Z+\delta)^2]=1+\delta^2\) and \(\mathbb E[1/V]=1/(\nu-2)\) for \(V\sim\chi^2_{\nu}\). Then
\[ \mathrm{Var}(T) = \mathbb E[T^2] - \big(\mathbb E[T]\big)^2 = \frac{\nu\,(1+\delta^2)}{\nu-2} - \big(\mathbb E[T]\big)^2. \]
6.3 Likelihood#
For i.i.d. observations \(t_1,\dots,t_n\) from \(\mathrm{NCT}(\nu,\delta)\), the likelihood is
\[ L(\nu,\delta) = \prod_{i=1}^{n} f_T(t_i;\,\nu,\delta) \]
and the log-likelihood is
\[ \ell(\nu,\delta) = \sum_{i=1}^{n} \log f_T(t_i;\,\nu,\delta). \]
There is no closed-form MLE for \((\nu,\delta)\) in general; numerical optimization is used (e.g. nct.fit).
def nct_neg_loglik(params: np.ndarray, data: np.ndarray) -> float:
df, nc = float(params[0]), float(params[1])
if not np.isfinite(df) or df <= 0 or not np.isfinite(nc):
return np.inf
return -float(np.sum(nct_logpdf(data, df, nc)))
# Synthetic example: recover parameters via numerical MLE
true_df, true_nc = 8.0, 1.25
sample = nct.rvs(true_df, true_nc, size=1500, random_state=rng)
res = optimize.minimize(
nct_neg_loglik,
x0=np.array([10.0, 0.5]),
args=(sample,),
bounds=[(1e-6, None), (None, None)],
)
print("True (df, nc):", (true_df, true_nc))
print("MLE (df, nc):", tuple(res.x))
print("Success:", res.success)
True (df, nc): (8.0, 1.25)
MLE (df, nc): (8.194098582508634, 1.2025931855233258)
Success: True
7) Sampling & simulation (NumPy-only)#
The construction \(T=(Z+\delta)/\sqrt{V/\nu}\) immediately gives a simple sampler:
Sample \(Z\sim\mathcal N(0,1)\).
Sample \(V\sim\chi^2_{\nu}\) independently.
Return \(T=(Z+\delta)/\sqrt{V/\nu}\).
NumPy can sample both normals and chi-square variables, so this is a true “from-scratch” simulator in the sense that it does not call scipy.stats.nct.rvs.
def nct_rvs_numpy(df: float, nc: float, size: int, rng: np.random.Generator) -> np.ndarray:
if df <= 0:
raise ValueError("df must be > 0")
z = rng.standard_normal(size) + nc
v = rng.chisquare(df, size=size)
return z / np.sqrt(v / df)
# Sanity-check: Monte Carlo mean/variance vs theory
df, nc = 10.0, 1.2
x_mc = nct_rvs_numpy(df, nc, size=300_000, rng=rng)
m_theory, v_theory, *_ = nct_moments_closed_form(df, nc)
print("MC mean/var:", float(x_mc.mean()), float(x_mc.var()))
print("Theory mean/var:", float(m_theory), float(v_theory))
MC mean/var: 1.302083304452847 1.356864236401281
Theory mean/var: 1.3004667695269725 1.35878618135608
8) Visualization#
We’ll plot:
the PDF and a Monte Carlo histogram
the CDF and an empirical CDF from samples
def ecdf(x: np.ndarray):
xs = np.sort(np.asarray(x, dtype=float))
ps = (np.arange(1, xs.size + 1) / xs.size)
return xs, ps
df, nc = 8.0, 1.5
x_grid = np.linspace(-8, 8, 1500)
samples = nct_rvs_numpy(df, nc, size=80_000, rng=rng)
fig = make_subplots(
rows=1,
cols=2,
subplot_titles=("PDF + Monte Carlo histogram", "CDF + empirical CDF"),
)
# PDF + histogram
fig.add_trace(
go.Histogram(
x=samples,
nbinsx=120,
histnorm="probability density",
name="samples",
opacity=0.55,
),
row=1,
col=1,
)
fig.add_trace(
go.Scatter(x=x_grid, y=nct_pdf(x_grid, df, nc), name="theoretical PDF", mode="lines"),
row=1,
col=1,
)
# CDF + ECDF
xs, ps = ecdf(samples)
fig.add_trace(
go.Scatter(x=x_grid, y=nct_cdf(x_grid, df, nc), name="theoretical CDF", mode="lines"),
row=1,
col=2,
)
fig.add_trace(
go.Scatter(x=xs[::50], y=ps[::50], name="empirical CDF", mode="markers", marker=dict(size=4)),
row=1,
col=2,
)
fig.update_layout(height=430, width=980, title_text=f"Noncentral t (df={df}, nc={nc})")
fig.update_xaxes(title_text="x")
fig.update_yaxes(title_text="density", row=1, col=1)
fig.update_yaxes(title_text="F(x)", row=1, col=2)
fig.show()
9) SciPy integration (scipy.stats.nct)#
SciPy provides the noncentral t distribution as scipy.stats.nct.
Key methods:
nct.pdf(x, df, nc), nct.logpdf(...)
nct.cdf(x, df, nc), nct.sf(...) (survival)
nct.rvs(df, nc, size=..., random_state=...)
nct.fit(data, ...) (MLE)
SciPy also supports loc and scale, i.e. \(X = \mathrm{loc} + \mathrm{scale}\cdot T\).
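A quick sketch of what this transformation means in practice (the loc/scale values below are arbitrary):

```python
import numpy as np
from scipy.stats import nct

df, nc, loc, scale = 12.0, -0.75, 2.0, 3.0
x = np.linspace(-5.0, 10.0, 7)

# X = loc + scale * T: the density picks up a 1/scale factor,
# while the CDF is evaluated at the standardized argument.
assert np.allclose(nct.pdf(x, df, nc, loc=loc, scale=scale),
                   nct.pdf((x - loc) / scale, df, nc) / scale)
assert np.allclose(nct.cdf(x, df, nc, loc=loc, scale=scale),
                   nct.cdf((x - loc) / scale, df, nc))
print("loc/scale transform verified")
```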
df, nc = 12.0, -0.75
x = np.array([-2.0, 0.0, 2.0])
print("pdf:", nct.pdf(x, df, nc))
print("cdf:", nct.cdf(x, df, nc))
print("rvs:", nct.rvs(df, nc, size=5, random_state=rng))
print("entropy:", nct.entropy(df, nc))
# Fit example (standardized loc=0, scale=1)
data = nct.rvs(9.0, 1.1, size=4000, random_state=rng)
df_hat, nc_hat, loc_hat, scale_hat = nct.fit(data, floc=0.0, fscale=1.0)
print("fit df, nc:", df_hat, nc_hat)
pdf: [0.1774 0.2949 0.0125]
cdf: [0.1311 0.7734 0.9942]
rvs: [-2.426 -0.1903 -0.34 -0.0526 -1.8375]
entropy: 1.5144120934998153
fit df, nc: 9.19392060339923 1.1072185503404517
10) Statistical use cases#
10.1 Hypothesis testing: power of a t-test#
For a two-sided one-sample t-test at level \(\alpha\), the rejection region is
\[ |T| > t_{\nu,\,1-\alpha/2}, \]
where \(t_{\nu,\,1-\alpha/2}\) is the central t critical value with \(\nu=n-1\) degrees of freedom.
Under the alternative, \(T\sim\mathrm{NCT}(\nu,\delta)\) with \(\delta=\sqrt{n}(\mu-\mu_0)/\sigma\). So the power is
\[ \mathrm{Power} = P\!\left(T > t_{\nu,\,1-\alpha/2}\right) + P\!\left(T < -t_{\nu,\,1-\alpha/2}\right), \]
with both probabilities evaluated under \(\mathrm{NCT}(\nu,\delta)\).
10.2 Bayesian modeling: predictive distribution of a t-statistic#
If you put a prior \(\pi\) on the standardized effect size \(d\) (hence on \(\delta=\sqrt{n}\,d\)), then the predictive distribution of the t-statistic is a mixture of noncentral t distributions:
\[ p(t) = \int f_T\!\left(t;\,\nu,\ \sqrt{n}\,d\right)\pi(d)\,dd. \]
This mixture is rarely available in closed form, but it is straightforward to sample.
10.3 Generative modeling#
nct is a natural generator for t-statistics / z-like scores under alternatives.
From that you can simulate p-value distributions, power curves, and multiple-testing behavior.
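A sketch of the p-value side (the design values n and d below are illustrative): sample t-statistics under the alternative and convert them to two-sided p-values with the central t null distribution.

```python
import numpy as np
from scipy.stats import nct, t

rng = np.random.default_rng(0)
n, d = 20, 0.5                       # illustrative design: n observations, effect d
nu, delta = n - 1, np.sqrt(n) * d

# t-statistics under the alternative, then two-sided p-values under the null.
t_stats = nct.rvs(nu, delta, size=100_000, random_state=rng)
pvals = 2.0 * t.sf(np.abs(t_stats), nu)

# Under H1 the p-values pile up near 0; the rejection rate at alpha = 0.05
# is a Monte Carlo estimate of the test's power.
print("P(p < 0.05):", round(float((pvals < 0.05).mean()), 3))
```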
def power_two_sided_t(n: int, alpha: float, mu_minus_mu0: float, sigma: float = 1.0) -> float:
df = n - 1
delta = np.sqrt(n) * (mu_minus_mu0 / sigma)
crit = t.isf(alpha / 2, df)
return float(nct.sf(crit, df, delta) + nct.cdf(-crit, df, delta))
alpha = 0.05
n = 20
sig = 1.0
effects = np.linspace(0.0, 1.2, 61)
powers = np.array([power_two_sided_t(n, alpha, eff, sigma=sig) for eff in effects])
fig = go.Figure(
data=[go.Scatter(x=effects, y=powers, mode="lines", name="power")],
layout=go.Layout(
title=f"Two-sided one-sample t-test power (n={n}, alpha={alpha})",
xaxis_title="effect (mu - mu0) in units of sigma",
yaxis_title="power",
yaxis=dict(range=[0, 1]),
),
)
fig.show()
# Bayesian-flavored example: predictive distribution of T under a prior on the standardized effect d
# d = (mu - mu0)/sigma, delta = sqrt(n)*d
n = 20
nu = n - 1
tau = 0.4 # prior std on standardized effect size d
m = 120_000
# Prior on effect size -> prior on delta
prior_d = rng.normal(0.0, tau, size=m)
prior_delta = np.sqrt(n) * prior_d
# Predictive sampling: sample delta, then sample T | delta ~ nct(nu, delta)
# (NumPy-only sampler using the nct construction)
z = rng.standard_normal(m) + prior_delta
v = rng.chisquare(nu, size=m)
t_pred = z / np.sqrt(v / nu)
a = 0.05
crit = t.isf(a / 2, nu)
print("Prior predictive P(reject H0):", float((np.abs(t_pred) > crit).mean()))
xg = np.linspace(-6, 6, 900)
fig = go.Figure()
fig.add_trace(
go.Histogram(
x=t_pred,
nbinsx=140,
histnorm="probability density",
name="prior-predictive samples",
opacity=0.6,
)
)
fig.add_trace(go.Scatter(x=xg, y=t.pdf(xg, nu), name="central t (delta=0)", mode="lines"))
fig.update_layout(
title=f"Prior predictive distribution of T (nu={nu}, prior d ~ N(0, {tau}^2))",
xaxis_title="t statistic",
yaxis_title="density",
)
fig.show()
Prior predictive P(reject H0): 0.3202083333333333
11) Pitfalls#
Parameter constraints: \(\nu>0\). Many summary statistics require stronger conditions (e.g. variance needs \(\nu>2\)).
Skewness and asymmetry: when \(\delta\neq 0\), the distribution is not symmetric; don’t reuse symmetric t heuristics.
Extreme tails: for small \(\nu\), Monte Carlo samples can be very large; use robust summaries (quantiles) and sufficient sample sizes.
Numerical issues: evaluating the PDF/CDF for very large \(|\delta|\) or very small \(\nu\) can be challenging; prefer library implementations (special.nctdtr, nct.logpdf).
MLE fitting: nct.fit is a nonconvex numerical optimization; it can be sensitive to starting values and may converge to poor local optima.
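One practical mitigation is a multi-start fit: run nct.fit from several starting points and keep the solution with the best likelihood. A sketch on synthetic data (the starting values below are arbitrary):

```python
import numpy as np
from scipy.stats import nct

rng = np.random.default_rng(1)
data = nct.rvs(8.0, 1.25, size=2000, random_state=rng)

# Positional args after `data` seed the optimizer's (df, nc) guesses;
# fixing loc/scale keeps the problem two-dimensional.
best = None
for df0, nc0 in [(2.0, -1.0), (10.0, 0.5), (50.0, 3.0)]:
    df_hat, nc_hat, _, _ = nct.fit(data, df0, nc0, floc=0.0, fscale=1.0)
    nll = -np.sum(nct.logpdf(data, df_hat, nc_hat))
    if best is None or nll < best[0]:
        best = (nll, df_hat, nc_hat)

print("best-likelihood (df, nc):", (round(best[1], 2), round(best[2], 2)))
```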
12) Summary#
nct is a continuous distribution on \(\mathbb R\) with parameters \(\nu>0\) (degrees of freedom) and \(\delta\in\mathbb R\) (noncentrality).
It is the distribution of a t-statistic under an alternative: \(T=(Z+\delta)/\sqrt{V/\nu}\).
Moments exist only when \(\nu\) is large enough (mean: \(\nu>1\), variance: \(\nu>2\), etc.).
Sampling is easy from first principles with NumPy (normal + chi-square).
scipy.stats.nct provides stable numerical evaluation (pdf, cdf) and estimation (fit).